NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Unified Binary and Multiclass Margin-Based Classification

Wang, Yutong; Scott, Clayton (May 2024, Journal of Machine Learning Research)

The notion of margin loss has been central to the development and analysis of algorithms for binary classification. To date, however, there remains no consensus as to the analogue of the margin loss for multiclass classification. In this work, we show that a broad range of multiclass loss functions, including many popular ones, can be expressed in the relative margin form, a generalization of the margin form of binary losses. The relative margin form is broadly useful for understanding and analyzing multiclass losses as shown by our prior work (Wang and Scott, 2020, 2021). To further demonstrate the utility of this way of expressing multiclass losses, we use it to extend the seminal result of Bartlett et al. (2006) on classification calibration of binary margin losses to multiclass. We then analyze the class of Fenchel-Young losses, and expand the set of these losses that are known to be classification-calibrated.
more » « less
Full Text Available
Neural Collapse in Multi-label Learning with Pick-all-label Loss

Li, Pengyu; Li, Xiao; Wang, Yutong; Qu, Qing (June 2024, International Conference on Machine Learning)

We study deep neural networks for the multi-label classification (MLab) task through the lens of neural collapse (NC). Previous works have been restricted to the multi-class classification setting and discovered a prevalent NC phenomenon comprising of the following properties for the last-layer features: (i) the variability of features within every class collapses to zero, (ii) the set of feature means form an equi-angular tight frame (ETF), and (iii) the last layer classifiers collapse to the feature mean upon some scaling. We generalize the study to multi-label learning, and prove for the first time that a generalized NC phenomenon holds with the "pick-all-label'' formulation, which we term as MLab NC. While the ETF geometry remains consistent for features with a single label, multi-label scenarios introduce a unique combinatorial aspect we term the "tag-wise average" property, where the means of features with multiple labels are the scaled averages of means for single-label instances. Theoretically, under proper assumptions on the features, we establish that the only global optimizer of the pick-all-label cross-entropy loss satisfy the multi-label NC. In practice, we demonstrate that our findings can lead to better test performance with more efficient training techniques for MLab learning.
more » « less
Full Text Available
UV-VIS chip-scale spectropolarimeter

https://doi.org/10.1117/12.3013574

Kim, Juhyeon; Chen, Jiyi; Li, Pengyu; Wang, Yutong; Qu, Qing; Ku, Pei-Cheng (June 2024, SPIE)
Crocombe, Richard A; Barnett, Steven M; Profeta, Luisa_T M (Ed.)
Full Text Available
Predicting Performance and Accuracy of Mixed-Precision Programs for Precision Tuning

https://doi.org/10.1145/3597503.3623338

Wang, Yutong; Rubio-González, Cindy (February 2024, ACM)

Full Text Available
Sim2Real in reconstructive spectroscopy: Deep learning with augmented device-informed data simulation

https://doi.org/10.1063/5.0209339

Chen, Jiyi; Li, Pengyu; Wang, Yutong; Ku, Pei-Cheng; Qu, Qing (August 2024, APL Machine Learning)

This work proposes a deep learning (DL)-based framework, namely Sim2Real, for spectral signal reconstruction in reconstructive spectroscopy, focusing on efficient data sampling and fast inference time. The work focuses on the challenge of reconstructing real-world spectral signals in an extreme setting where only device-informed simulated data are available for training. Such device-informed simulated data are much easier to collect than real-world data but exhibit large distribution shifts from their real-world counterparts. To leverage such simulated data effectively, a hierarchical data augmentation strategy is introduced to mitigate the adverse effects of this domain shift, and a corresponding neural network for the spectral signal reconstruction with our augmented data is designed. Experiments using a real dataset measured from our spectrometer device demonstrate that Sim2Real achieves significant speed-up during the inference while attaining on-par performance with the state-of-the-art optimization-based methods.
more » « less
On Classification-Calibration of Gamma-Phi Losses

Wang, Yutong; Scott, Clayton (July 2023, Proeedings of Machine Learning Research)

Gamma-Phi losses constitute a family of multiclass classification loss functions that generalize the logistic and other common losses, and have found application in the boosting literature. We establish the first general sufficient condition for the classification-calibration (CC) of such losses. To our knowledge, this sufficient condition gives the first family of nonconvex multiclass surrogate losses for which CC has been fully justified. In addition, we show that a previously proposed sufficient condition is in fact not sufficient. This contribution highlights a technical issue that is important in the study of multiclass CC but has been neglected in prior work.
more » « less
Full Text Available
Foundation Intelligence for Smart Infrastructure Services in Transportation 5.0

https://doi.org/10.1109/TIV.2023.3349324

Han, Xu; Meng, Zonglin; Xia, Xin; Liao, Xishun; He, Brian Yueshuai; Zheng, Zhaoliang; Wang, Yutong; Xiang, Hao; Zhou, Zewei; Gao, Letian; et al (January 2024, IEEE Transactions on Intelligent Vehicles)

Full Text Available
Consistent Interpolating Ensembles via the Manifold-Hilbert Kernel

Wang, Yutong; Scott, Clayton (January 2022, Neural Information Processing Systems 2022)

Recent research in the theory of overparametrized learning has sought to establish generalization guarantees in the interpolating regime. Such results have been established for a few common classes of methods, but so far not for ensemble methods. We devise an ensemble classification method that simultaneously interpolates the training data, and is consistent for a broad class of data distributions. To this end, we define the manifold-Hilbert kernel for data distributed on a Riemannian manifold. We prove that kernel smoothing regression and classification using the manifold-Hilbert kernel are weakly consistent in the setting of Devroye et al. [19]. For the sphere, we show that the manifold-Hilbert kernel can be realized as a weighted random partition kernel, which arises as an infinite ensemble of partition-basedclassifiers.
more » « less
Full Text Available
VC DIMENSION OF PARTIALLY QUANTIZED NEURAL NETWORKS IN THE OVERPARAMETRIZED REGIME

Wang, Yutong; Scott, Clayton (January 2022, International Conference on Learning Representations)

Vapnik-Chervonenkis (VC) theory has so far been unable to explain the small generalization error of overparametrized neural networks. Indeed, existing applications of VC theory to large networks obtain upper bounds on VC dimension that are proportional to the number of weights, and for a large class of networks, these upper bound are known to be tight. In this work, we focus on a subclass of partially quantized networks that we refer to as hyperplane arrangement neural networks (HANNs). Using a sample compression analysis, we show that HANNs can have VC dimension significantly smaller than the number of weights, while being highly expressive. In particular, empirical risk minimization over HANNs in the overparametrized regime achieves the minimax rate for classification with Lipschitz posterior class probability. We further demonstrate the expressivity of HANNs empirically. On a panel of 121 UCI datasets, overparametrized HANNs match the performance of state-of-the-art full precision models.
more » « less
Full Text Available
VC dimension of partially quantized neural networks in the overparametrized regime.

Wang, Yutong; Scott, Clayton (January 2022, ICLR 2022)

Vapnik-Chervonenkis (VC) theory has so far been unable to explain the small generalization error of overparametrized neural networks. Indeed, existing applications of VC theory to large networks obtain upper bounds on VC dimension that are proportional to the number of weights, and for a large class of networks, these upper bound are known to be tight. In this work, we focus on a subclass of partially quantized networks that we refer to as hyperplane arrangement neural networks (HANNs). Using a sample compression analysis, we show that HANNs can have VC dimension significantly smaller than the number of weights, while being highly expressive. In particular, empirical risk minimization over HANNs in the overparametrized regime achieves the minimax rate for classification with Lipschitz posterior class probability. We further demonstrate the expressivity of HANNs empirically. On a panel of 121 UCI datasets, overparametrized HANNs match the performance of state-of-the-art full-precision models.
more » « less
Full Text Available

« Prev Next »

Search for: All records